Specialize `PartialOrd<A> for [A] where A: Ord` #39642

ghost · 2017-02-08T11:35:30Z

This way we can call `cmp` instead of `partial_cmp` in the loop, removing some burden of optimizing `Option`s away from the compiler.

PR #39538 introduced a regression where sorting slices suddenly became slower, since `slice1.lt(slice2)` was much slower than `slice1.cmp(slice2) == Less`. This problem is now fixed.
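To make the two spellings concrete, here is a minimal sketch of both routes to "is this slice less than that one". The helper names `less_via_partial_cmp` and `less_via_cmp` are hypothetical and exist only for illustration; they are not part of libcore.

```rust
use std::cmp::Ordering;

/// Roughly what the default `lt` does: go through `partial_cmp`,
/// which wraps the answer in an `Option` that must then be matched.
fn less_via_partial_cmp<A: PartialOrd>(a: &[A], b: &[A]) -> bool {
    a.partial_cmp(b) == Some(Ordering::Less)
}

/// With `A: Ord` a total order is available, so no `Option` is involved.
fn less_via_cmp<A: Ord>(a: &[A], b: &[A]) -> bool {
    a.cmp(b) == Ordering::Less
}
```

For element types that implement `Ord` both helpers return the same answer; only the intermediate `Option` differs. For types that are merely `PartialOrd`, such as `f64`, only the first helper is available.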

To verify, I benchmarked this simple program:

fn main() {
    let mut v = (0..2_000_000).map(|x| x * x * x * 18913515181).map(|x| vec![x, x ^ 3137831591]).collect::<Vec<_>>();
    v.sort();
}

Before this PR, it would take 0.95 sec, and now it takes 0.58 sec.
I also tried changing the `is_less` lambda to use `cmp` and `partial_cmp`. Now all three versions (`lt`, `cmp`, `partial_cmp`) are equally performant for sorting slices: all of them take 0.58 sec on the benchmark.
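For anyone wanting to repeat a comparison like this, a rough harness might look as follows. It is only a sketch under several assumptions: the element type is pinned to `u64` with wrapping multiplication (the program above leaves the integer type to inference), the comparisons are driven through `sort_by` rather than by editing the sort's internal `is_less`, and the names `make_input`, `timed_sort`, `by_cmp` and `by_partial_cmp` are hypothetical.

```rust
use std::cmp::Ordering;
use std::time::{Duration, Instant};

/// Builds input shaped like the benchmark's.
/// Assumption: `u64` with wrapping arithmetic, so debug builds do not panic.
fn make_input(n: u64) -> Vec<Vec<u64>> {
    (0..n)
        .map(|x| x.wrapping_mul(x).wrapping_mul(x).wrapping_mul(18913515181))
        .map(|x| vec![x, x ^ 3137831591])
        .collect()
}

/// Sorts a copy of `input` with the given comparator and reports the
/// sorted data together with the elapsed wall-clock time.
fn timed_sort<F>(input: &[Vec<u64>], compare: F) -> (Vec<Vec<u64>>, Duration)
where
    F: FnMut(&Vec<u64>, &Vec<u64>) -> Ordering,
{
    let mut v = input.to_vec();
    let start = Instant::now();
    v.sort_by(compare);
    (v, start.elapsed())
}

/// Comparator that uses the total order directly.
fn by_cmp(a: &Vec<u64>, b: &Vec<u64>) -> Ordering {
    a.cmp(b)
}

/// Comparator that goes through `partial_cmp`; the `unwrap` cannot fail
/// for `u64` elements because their order is total.
fn by_partial_cmp(a: &Vec<u64>, b: &Vec<u64>) -> Ordering {
    a.partial_cmp(b).unwrap()
}
```

Calling `timed_sort(&make_input(2_000_000), by_cmp)` and the same with `by_partial_cmp` in a release build gives two durations to compare. Absolute numbers will depend on the machine and compiler version.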

Tangentially, as soon as we get `default impl`, it might be a good idea to implement a blanket default impl for `lt`, `gt`, `le`, `ge` in terms of `cmp` whenever possible. Today, those four functions by default are only implemented in terms of `partial_cmp`.
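Without `default impl`, the same idea can already be written by hand for a single type. The sketch below uses a hypothetical newtype `ViaCmp` (not something from the standard library) whose `PartialOrd` impl overrides the four operators so they call `cmp` directly instead of going through `partial_cmp`.

```rust
use std::cmp::Ordering;

/// Hypothetical wrapper, used only to illustrate the idea.
#[derive(Debug, PartialEq, Eq)]
struct ViaCmp<T>(T);

impl<T: Ord> Ord for ViaCmp<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<T: Ord> PartialOrd for ViaCmp<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }

    // The four operators below skip the `Option` entirely.
    fn lt(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Less
    }
    fn le(&self, other: &Self) -> bool {
        self.cmp(other) != Ordering::Greater
    }
    fn gt(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Greater
    }
    fn ge(&self, other: &Self) -> bool {
        self.cmp(other) != Ordering::Less
    }
}
```

With this in place, `ViaCmp(a) < ViaCmp(b)` resolves to the overridden `lt`. A blanket version covering every `Ord` type is what would need the language feature mentioned above.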

r? @alexcrichton

ollie27 · 2017-02-08T14:36:56Z

src/libcore/slice.rs

+        self.len().partial_cmp(&other.len())
+    }
+}
+
 impl SlicePartialOrd<u8> for [u8] {


Could this impl not be extended to all `A: Ord` so it would simply use `Some(SliceOrd::compare(self, other))`?

That's a good suggestion, thanks! I've updated the code.
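The shape of that suggestion can be sketched in stable Rust with free functions. This is not the libcore code: `compare_total`, `compare_partial` and `compare_partial_for_ord` are hypothetical names, and the real change relies on trait specialization, which is not shown here.

```rust
use std::cmp::Ordering;

/// Lexicographic comparison for totally ordered elements. Stands in for
/// the role `SliceOrd::compare` plays in the diff (assumption).
fn compare_total<A: Ord>(a: &[A], b: &[A]) -> Ordering {
    for (x, y) in a.iter().zip(b) {
        match x.cmp(y) {
            Ordering::Equal => {}
            non_eq => return non_eq,
        }
    }
    // All shared positions are equal, so the shorter slice sorts first.
    a.len().cmp(&b.len())
}

/// General version for `PartialOrd` elements: every step yields an `Option`.
fn compare_partial<A: PartialOrd>(a: &[A], b: &[A]) -> Option<Ordering> {
    for (x, y) in a.iter().zip(b) {
        match x.partial_cmp(y) {
            Some(Ordering::Equal) => {}
            non_eq => return non_eq,
        }
    }
    a.len().partial_cmp(&b.len())
}

/// What the suggestion amounts to when `A: Ord`: wrap the total result once.
fn compare_partial_for_ord<A: Ord>(a: &[A], b: &[A]) -> Option<Ordering> {
    Some(compare_total(a, b))
}
```

For any `A: Ord`, `compare_partial_for_ord` and `compare_partial` agree, but the former builds a single `Option` at the end rather than one per element.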

alexcrichton · 2017-02-08T17:00:11Z

src/libcore/slice.rs

+        Some(SliceOrd::compare(self, other))
+    }
+}
+
 impl SlicePartialOrd<u8> for [u8] {


I think this specialization could be removed now, right?

Yeah, removed.

alexcrichton · 2017-02-08T17:00:54Z

Looks good to me, thanks @stjepang!

cc @rust-lang/libs, @bluss, I'm sure this has the likelihood of being super subtle, so would be good to get some more eyes on this as well

alexcrichton · 2017-02-10T16:42:54Z

@bors: r+

bors · 2017-02-10T16:42:55Z

📌 Commit a344c12 has been approved by alexcrichton

bors · 2017-02-11T04:37:31Z

⌛ Testing commit a344c12 with merge f140a6c...

bors · 2017-02-11T07:18:16Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing f140a6c to master...

rust-highfive assigned alexcrichton Feb 8, 2017

ollie27 reviewed Feb 8, 2017

Simplify by calling SliceOrd::compare

ececbb2

alexcrichton reviewed Feb 8, 2017

alexcrichton added the T-libs-api label Feb 8, 2017

Remove unnecessary specialization for [u8]

a344c12

frewsxcv mentioned this pull request Feb 11, 2017

Rollup of 8 pull requests #39735

Closed

bors merged commit a344c12 into rust-lang:master Feb 11, 2017

ghost deleted the specialize-slice-partialord branch February 11, 2017 09:53
